78 research outputs found

    Fast recursive matrix multiplication for multi-core architectures

    In this article, we present a fast algorithm for matrix multiplication optimized for recent multicore architectures. The implementation exploits different methodologies from parallel programming, such as recursive decomposition, efficient low-level implementations of basic blocks, software prefetching, and task scheduling, resulting in a multilevel algorithm with adaptive features. Measurements on different systems and comparisons with GotoBLAS, Intel Math Kernel Library (IMKL), and AMD Core Math Library (ACML) show that the matrix multiplication implementation presented has a very high efficiency.

    Diagonal-implicitly iterated Runge-Kutta methods on distributed memory multiprocessors

    We investigate the parallel implementation of the diagonal-implicitly iterated Runge-Kutta (DIIRK) method, an iteration method based on a predictor-corrector scheme. This method is appropriate for the solution of stiff systems of ordinary differential equations (ODEs) and provides embedded formulae to control the stepsize. We discuss different strategies for the implementation of the DIIRK method on distributed memory multiprocessors, which mainly differ in the order of independent computations and the data distribution. In particular, we consider a consecutive implementation that executes the steps of each corrector iteration in sequential order and distributes the resulting equation systems among all available processors, and a group implementation that executes the steps in parallel by independent groups of processors. The performance of these implementations depends on the right-hand side of the ODE system: for sparse functions, the group implementation is superior and achieves medium-range speedup values; for dense functions, the consecutive implementation is better and achieves good speedup values.

    Multi-Criteria Decision Support for Manufacturing Process Chains

    During manufacturing planning, engineers generate multiple variants of process chains for manufacturing the product to be developed. In order to select an optimal variant, multiple decision criteria specifying technical, ecological, and economic properties of the process chains, as well as multiple assessments by different domain experts, have to be taken into account. The contribution of this article is a two-step approach that provides a multi-criteria, multi-expert assessment of manufacturing process chains, supporting the selection of an optimal process chain. A web-based software tool that implements the multi-criteria assessment of process chains is also presented.

    Exploiting Heterogeneous Compute Resources for Optimizing Lightweight Structures

    Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015. Optimizing lightweight structures with numerical simulations leads to the development of complex simulation codes with high computational demands. The optimization approach for lightweight structures consisting of fiber-reinforced plastics is considered. During the simulated optimization, independent simulation tasks have to be executed efficiently on the heterogeneous computing resources. In this article, several scheduling methods for distributing parallel simulation tasks among compute nodes are presented. Performance results are shown for the scheduling and execution of synthetic benchmark tasks, matrix multiplication tasks, and FEM simulation tasks on a heterogeneous compute cluster. This work was performed within the Federal Cluster of Excellence EXC 1075 “MERGE Technologies for Multifunctional Lightweight Structures” and supported by the German Research Foundation (DFG).

    Parallel iterated Runge-Kutta methods and applications

    The iterated Runge-Kutta (IRK) method is an iteration scheme for the numerical solution of initial value problems (IVPs) of ordinary differential equations (ODEs) that is based on a predictor-corrector method with a Runge-Kutta (RK) method as corrector. Embedded approximation formulae are used to control the stepsize. We present different parallel algorithms of the IRK method on distributed memory multiprocessors for the solution of systems of ODEs. The parallel algorithms are given in an SPMD (single-program multiple-data) programming style where data exchanges are described with appropriate communication primitives. A theoretical performance analysis and a runtime simulation allow the presented algorithms to be evaluated. The implementation on the Intel iPSC/860 confirms the predicted runtimes. The speedup values strongly depend on the particular system of ODEs to be solved. The parallel IRK method is applied to a typical discretization problem, the discretized Brusselator equation. Application-specific modifications of the general parallel ODE solver are developed which result in a considerable reduction of the parallel execution time.

    Parallele Strategien für ein spektrales Wolkenmodul in einem 3-dimensionalen Mesoskalenmodell (Parallel strategies for a spectral cloud module in a 3-dimensional mesoscale model)

    A spectral cloud module is developed for a 3-dimensional mesoscale model, initially considering only the microphysical conversion processes of the warm cloud. Because the expected computation time is strongly increased compared with the bulk parameterization, we develop concepts for the parallelization of the module, explain their applicability, and present first results.

    HeteroPar 2014, APCIE 2014, and TASUS 2014 Special Issue

    This is the editorial of the special issue of the HeteroPar 2014, APCIE 2014, and TASUS 2014 workshops.
